ABSTRACT

An iterative design algorithm for the joint design of 
complexity- and entropy-constrained subband quantiz- 
ers and associated entropy coders is proposed. Un- 
like conventional subband design algorithms, the pro- 
posed algorithm does not require the use of various bit 
allocation algorithms. Multistage residual quantizers 
are employed here because they provide greater con- 
trol of the complexity-performance tradeoffs, and also 
because they allow efficient and effective high-order sta- 
tistical modeling. The resulting subband coder exploits 
statistical dependencies within subbands, across sub- 
bands, and across stages, mainly through complexity- 
constrained high-order entropy coding. Experimental 
results demonstrate that the complexity-rate-distortion 
performance of the new subband coder is exceptional. 

1. Introduction

The conventional approach to subband image coding 
has been to design separate optimal or near-optimal 
quantizers and associated entropy coders for each of 
the subband images. A bit allocation algorithm is then 
used to distribute bits among the subbands [1]. 

The subband image coder proposed here is different 
in that the subband quantizers and associated entropy 
coders are optimized jointly within and across the sub- 
bands in a complexity- and entropy-constrained frame- 
work. The algorithm used to design the coder employs 
multistage residual vector quantizers [2] which feature 
very low complexity and memory, and provide much 
greater control over both design and encoding com- 
plexity. An important aspect of the algorithm is that 
no explicit bit allocation is needed. The bits are in- 
directly (but optimally) allocated among the subbands 
during the design process. The new coder exploits both 
statistical intra-band and inter-band dependencies si- 
multaneously, mainly through complexity-constrained 
high-order conditional entropy coding, as discussed in 
the remainder of this paper. 

This work was supported in part by the National Sci- 
ence Foundation under contract MIP-9116113 and the Na- 
tional Aeronautics and Space Administration. 



Figure 1: Basic block diagram of the subband encoder 

2. System Description 

As shown in Figure 1, the input signal is first decomposed
into $M$ subbands using an analysis transformation. Each
subband is then encoded using a sequence of $P_m$
($1 \le m \le M$) residual vector quantization (RVQ)
fixed-length encoders. Any vector quantization (VQ)
encoder can be used, although we find
RVQs to achieve excellent rate-distortion-complexity 
performance [3]. The output symbol for each stage 
vector quantizer is fed into an entropy coder driven by 
a high-order stage statistical model that is controlled 
by a finite state machine (FSM). The FSM allows the 
statistical model to switch among several zero-order 
conditional models (represented by first-order probabilities)
based on the state transitions. In this work, a nonlinear
mapping $F$ given by $u = F(s_1, s_2, \ldots, s_n)$, where
$s_1, s_2, \ldots, s_n$ are $n$ previously coded symbols or
outputs of some stage quantizers, is used to determine the
conditioning state $u$. The construction of the best
mapping $F$ is described in Section 4. Finally,
the output bits of the entropy coders are combined to- 
gether and sent to the channel. Since only previously 
coded symbols are used by the FSM, no side informa- 
tion is necessary and the decoder can track the state of 
the encoder. 
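
The state-driven modeling described above can be illustrated with a minimal sketch. It is not the coder used in the paper: the function names `state_of` and `ideal_code_length` are hypothetical, the mapping F is assumed to be a plain base-N packing of the n previous symbols, and an ideal adaptive coder (code length equal to minus the log of the current probability estimate) stands in for the actual entropy coder. Because the state depends only on symbols already coded, a decoder running the same loop would reach the same states without side information.

```python
import math
from collections import defaultdict

def state_of(prev_symbols, alphabet_size):
    """Hypothetical mapping F: pack the previous symbols into one state index u."""
    u = 0
    for s in prev_symbols:
        u = u * alphabet_size + s
    return u

def ideal_code_length(symbols, n, alphabet_size):
    """Bits spent by an idealized adaptive coder that conditions on n previous symbols."""
    # One table of symbol counts per conditioning state, initialized to ones
    # (an assumption made here so that no probability estimate is zero).
    counts = defaultdict(lambda: [1] * alphabet_size)
    history = [0] * n  # encoder and decoder agree on this starting history
    bits = 0.0
    for s in symbols:
        table = counts[state_of(history, alphabet_size)]
        bits += -math.log2(table[s] / sum(table))
        table[s] += 1  # adapt the model after coding the symbol
        if n:
            history = history[1:] + [s]
    return bits
```

For a strongly dependent source, such as an alternating binary sequence, calling `ideal_code_length` with n = 1 yields far fewer bits than with n = 0, which is the gain that conditioning is meant to capture.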






3. The Design Algorithm 


Given fixed analysis/synthesis transformations, the proposed
design algorithm minimizes the expected distortion
$E[d(X, \hat{X})]$, where $X$ is the input and $\hat{X}$ is
the output, subject to a constraint on the overall entropy
of the product of the $M$ subband VQs. This design
algorithm is an iterative descent algorithm based on a
Lagrangian minimization, and is a generalization of the
entropy-constrained algorithms described in [4, 2, 5].
Given a fixed Lagrangian parameter $\lambda$, the algorithm
attempts to satisfy the optimality conditions
simultaneously, which requires that the subband encoders,
decoders, and entropy coders be designed jointly. Details
of the optimality conditions and convergence of the
algorithm can be found in [3].

The Lagrangian parameter $\lambda$ is chosen based on the
overall rate and distortion of the subbands, and is used in
the entropy-constrained design of all subband quantizers.
Therefore, explicit bit allocation is not needed in the
design process. In fact, it can be shown [3, 6] that
maintaining the same slope $\lambda$ for all subband
operational R(D) curves results in a locally optimal
allocation of bits.
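
The equal-slope argument can be made concrete with a small sketch. It assumes each subband is summarized by a short list of hypothetical (rate, distortion) operating points; the function name `pick_operating_points` and the numbers in the usage note are illustrative and not taken from the paper. Each subband independently minimizes D + lambda * R, and the resulting rates are the implied allocation.

```python
def pick_operating_points(subband_curves, lam):
    """For each subband, choose the (rate, distortion) point minimizing D + lam * R."""
    choices = []
    for curve in subband_curves:
        # Every subband uses the same multiplier, so all chosen points
        # sit at (approximately) the same slope of their R(D) curves.
        best = min(curve, key=lambda rd: rd[1] + lam * rd[0])
        choices.append(best)
    total_rate = sum(r for r, _ in choices)
    total_dist = sum(d for _, d in choices)
    return choices, total_rate, total_dist
```

With the made-up curves `[(0, 100), (1, 40), (2, 15), (3, 8)]` and `[(0, 10), (1, 6), (2, 4), (3, 3)]` and a multiplier of 5, the first subband receives 3 bits and the second none, for a total distortion of 18. No other split of 3 bits between these two curves gives a lower total distortion.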


4. Complexity Issues 

The complexity and memory associated with the design
algorithm grow exponentially as a function of the
quantization and entropy coding parameters. To reduce them
substantially, constrained VQs must generally be employed.
In this work, we choose to use
multistage residual vector quantizers, mainly because 
they require relatively low encoding complexity and 
memory, and because they simplify the design pro- 
cess by providing greater control over the complexity- 
performance tradeoffs. 

As shown in [3], optimal encoding requires that the
synthesis transformation be embedded in the design 
loop, which can result in much larger computational 
and memory requirements. However, experimental re- 
sults show that the subbands can be encoded accu- 
rately by minimizing the distortion between the input 
and the output of the subband quantizers (indepen- 
dently) instead of minimizing the overall distortion of 
the analysis-quantizer-synthesis system. 

Exhaustive searching, which is generally necessary for
optimal encoding, can be circumvented effectively by
exploiting the multistage residual structure and using
tree-structured searching techniques such as the (M, L)
algorithm. Such algorithms can be used to search the stage
codebooks in all subbands with reasonable computational
requirements. In particular, we choose to use the dynamic
M-search algorithm [7], which employs a thresholding
technique to decide the best number of paths that should be
saved at each RVQ stage. Dynamic M-search provides a
flexible way of trading complexity for performance, and it
can achieve rate-distortion performance that is very close
to that of exhaustive searching while requiring only
20%-50% more computations than sequential searching.

Figure 2: Inter-stage, inter-band, and intra-band
conditioning scheme within an image.
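
The multipath search idea can be sketched as follows for a residual scalar quantizer. This is a simplified illustration under stated assumptions: the number of surviving paths M is fixed rather than chosen by the thresholding rule of dynamic M-search [7], only squared error is minimized (no rate term), and the function name `m_search` is hypothetical.

```python
def m_search(x, stage_codebooks, M):
    """Keep the M best partial paths at each stage of a residual scalar quantizer."""
    # A path is (residual, indices): the residual is what later stages still have to code.
    paths = [(x, ())]
    for codebook in stage_codebooks:
        candidates = [
            (residual - c, indices + (j,))
            for residual, indices in paths
            for j, c in enumerate(codebook)
        ]
        # Rank all extensions by squared residual and keep only the M best.
        candidates.sort(key=lambda path: path[0] ** 2)
        paths = candidates[:M]
    residual, indices = paths[0]
    return indices, residual ** 2
```

With the illustrative stage codebooks `[-4, 0, 4]` and `[-3, 0, 3]` and the input 2.4, sequential searching (M = 1) commits to 4 in the first stage and ends on the path (2, 0), whereas keeping two paths finds (1, 2), which reproduces 3 and has a smaller error.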

Optimal decoding requires a 2-dimensional opti- 
mization procedure (see Figure 1) which consists of 
using the iterative Gauss-Seidel algorithm [5] to mini- 
mize the average distortion between the input and the 
synthesized reproduction of all stage codebooks in all 
subbands. Although the joint decoding optimization is 
potentially more demanding than that associated with 
non-constrained quantizers, its complexity can be
drastically reduced by, for example, grouping neighboring
stage codebooks in neighboring subbands and jointly 
optimizing each group independently. This typically 
results in less than a 0.10 dB loss in PSNR. 

The most important advantage of the multistage 
residual structure is that it can substantially reduce 
the complexity and memory by making the output al- 
phabet of the stage quantizers small (e.g., 2, 3, or 4). 
For example, a 4-stage RVQ with 4 code vectors per 
stage codebook using a second-order conditional en- 
tropy coder for each stage, generally requires that 64 
probabilities per stage be computed and stored. For 
a single-stage conventional VQ with 256 code vectors, 
as many as $256^3$ probabilities may need to be computed
and stored. Multistage RVQs provide another
dimension upon which to capitalize. To illustrate this 
point, Figure 2 shows the inter-stage, inter-band, and 
intra-band conditioning scheme used in the system. 
Each image shown is a multistage approximation of 
the input image, and statistical dependencies among 
these images generally exist. For each stage $(m, p)$ in
each subband $m$, a 5-dimensional initial region of support
$\mathcal{R}_{m,p}$ containing a sufficiently large number
$R_{m,p}$ of conditioning symbols, or previous outputs of
fixed-length stage encoders, is first chosen. Since using
the region of support $\mathcal{R}_{m,p}$ in the
conditioning process generally results in unbearable
complexity, we locate the $n_{m,p}$, $n_{m,p} \ll R_{m,p}$,
conditioning symbols $s_1, \ldots, s_{n_{m,p}}$ such that
the $n_{m,p}$th-order conditional entropy
$H(J_{m,p} \mid s_1, \ldots, s_{n_{m,p}})$ is minimized. To
do this we build a tree where the levels represent the
orders (or number of conditioning symbols) and the branches
represent the possible combinations of conditioning symbols
at each level. As is described in [8], this tree is
symmetric, which can simplify the search process. Our
experimental results show that dynamic M-search provides an
excellent balance between conditional entropy and search
complexity.
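
The selection of conditioning symbols can be sketched with empirical conditional entropies. The sketch below is an assumption-laden simplification: it grows the conditioning set greedily, one symbol per tree level, which corresponds to keeping a single path rather than running the dynamic M-search used in the paper. The names `conditional_entropy` and `greedy_select` are hypothetical, and `candidates[k][i]` is assumed to hold the value of the k-th candidate conditioning symbol for training sample i.

```python
import math
from collections import Counter

def conditional_entropy(targets, contexts):
    """Empirical H(target | context) in bits, from paired training samples."""
    joint = Counter(zip(contexts, targets))
    marginal = Counter(contexts)
    total = len(targets)
    return -sum(
        count / total * math.log2(count / marginal[ctx])
        for (ctx, _), count in joint.items()
    )

def greedy_select(targets, candidates, n):
    """Pick n candidate conditioning symbols, adding the best one at each level."""
    chosen = []
    for _ in range(n):
        def entropy_with(k):
            # Context = tuple of already chosen symbols plus candidate k.
            cols = chosen + [k]
            contexts = [tuple(candidates[c][i] for c in cols)
                        for i in range(len(targets))]
            return conditional_entropy(targets, contexts)
        remaining = [k for k in range(len(candidates)) if k not in chosen]
        chosen.append(min(remaining, key=entropy_with))
    return chosen
```

A candidate that predicts the stage output well is picked before one that is statistically unrelated to it, since it lowers the conditional entropy more.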

Since the objective is to minimize the average entropy of
all stage statistical models in all subbands given a fixed
level of complexity and memory of the joint entropy coder,
our approach is to first build a tree with
$\sum_{m=1}^{M} P_m$ branches, where $P_m$ is the number of
stage codebooks in the $m$th subband. Each branch is a
unary tree of length $L_{m,p}$, where $L_{m,p}$ is the
number of complexity-entropy pairs. The dynamic M-search
algorithm is used to find the best $n_{m,p}$
($1 \le n_{m,p} \le L_{m,p}$) conditioning symbols given
$R_{m,p}$ conditioning symbols. For each complexity-entropy
pair, complexity is given by $M_{m,p} = S_{m,p} N_{m,p}$,
where $S_{m,p}$ is the number of all combinations of
realizations of the conditioning symbols and $N_{m,p}$ is
the output alphabet size of stage $p$ in band $m$. Once all
complexity-entropy pairs are obtained, we then use the
generalized BFOS algorithm [9] to minimize the overall
output entropy subject to a constraint $\mathcal{N}_{\max}$
on the total number of conditional probabilities, which is
used here as a measure of complexity and memory.
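
The budgeted trade-off between entropy and the number of stored probabilities can be illustrated with a greedy pruning sketch. This is not the generalized BFOS algorithm of [9]; it is a simplified stand-in that borrows its pruning idea. Each branch is assumed to be a list of hypothetical (complexity, entropy) pairs with strictly increasing complexity, and the function name `prune_to_budget` is illustrative.

```python
def prune_to_budget(branches, budget):
    """Return, per branch, the index of the retained (complexity, entropy) pair.

    Every branch starts at its most complex pair. While the total complexity
    exceeds the budget, the branch whose one-step pruning costs the least
    entropy per probability saved is stepped back by one pair.
    """
    level = [len(pairs) - 1 for pairs in branches]

    def total():
        return sum(branches[b][level[b]][0] for b in range(len(branches)))

    while total() > budget:
        best_b, best_slope = None, None
        for b, pairs in enumerate(branches):
            if level[b] == 0:
                continue  # this branch cannot be pruned any further
            c_hi, h_hi = pairs[level[b]]
            c_lo, h_lo = pairs[level[b] - 1]
            slope = (h_lo - h_hi) / (c_hi - c_lo)  # entropy increase per probability saved
            if best_slope is None or slope < best_slope:
                best_b, best_slope = b, slope
        if best_b is None:
            raise ValueError("budget cannot be met even with the simplest models")
        level[best_b] -= 1
    return level
```

Since complexity is the product of the number of conditioning states and the alphabet size, a ternary stage with model orders 0, 1, and 2 would contribute pairs with complexities 3, 9, and 27. Branches whose entropy barely improves at higher orders are pruned first.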

The FSM statistical model for each stage $(m, p)$ employs a
mapping $F$ to determine the state given $n_{m,p}$
available symbols. The mapping $F$ is one-to-one and is
actually given by a table that contains the numbers
$0, 1, \ldots, S_{m,p} - 1$, representing each of the
possible combinations. As discussed in [10], a large number
of the tables representing the $S_{m,p}$ states are usually
either not populated or sparsely populated. In addition to
this inefficiency, some of the empty tables may be visited
during actual encoding even though they were never visited
during the design process. This is the so-called empty
state problem that often arises in FSM design. To address
this, we use the PNN algorithm [11], as described in [12],
to drastically reduce the number of states while still
bounding the loss in entropy performance to 1%. The PNN
algorithm used here first merges all of the empty states
with the least probable state into one conditioning state,
thereby completely removing empty states. Then, the two
conditioning states resulting in the lowest increase in
entropy (when merged) are combined into one conditioning
state, and so on until only one state, which represents one
table of first-order probabilities, is obtained. As
described above, another tree with $\sum_{m=1}^{M} P_m$
branches is built, where the nodes now represent
complexity-entropy pairs obtained by the PNN algorithm. The
BFOS algorithm is again used to minimize the overall output
entropy subject to the use of a much smaller number of
conditional probabilities.
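
The state-merging procedure can be sketched as a pairwise nearest-neighbor loop over tables of symbol counts. The sketch makes several assumptions: states are represented by hypothetical count tables gathered during design, at least one state is populated, merging stops at a requested number of states instead of at the 1% entropy-loss bound, and the names `weighted_entropy` and `pnn_merge` are illustrative.

```python
import math

def weighted_entropy(counts):
    """Entropy of one state's table in bits, multiplied by its sample count."""
    total = sum(counts)
    return -sum(c * math.log2(c / total) for c in counts if c > 0)

def pnn_merge(tables, target):
    """Merge conditioning states until `target` states remain.

    Returns a list of (member state indices, merged counts).
    """
    groups = [([i], list(t)) for i, t in enumerate(tables) if sum(t) > 0]
    empty = [i for i, t in enumerate(tables) if sum(t) == 0]
    # Step 1: fold every empty state into the least probable populated state.
    least = min(range(len(groups)), key=lambda g: sum(groups[g][1]))
    groups[least][0].extend(empty)
    # Step 2: repeatedly merge the pair whose merge increases entropy the least.
    while len(groups) > target:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                merged = [a + b for a, b in zip(groups[i][1], groups[j][1])]
                increase = (weighted_entropy(merged)
                            - weighted_entropy(groups[i][1])
                            - weighted_entropy(groups[j][1]))
                if best is None or increase < best[0]:
                    best = (increase, i, j, merged)
        _, i, j, merged = best
        groups[i] = (groups[i][0] + groups[j][0], merged)
        del groups[j]
    return groups
```

States with similar symbol statistics are merged first, so the first merges cost little entropy while still removing whole tables of probabilities.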

Quantizing the conditioning states has the addi- 
tional advantage that the stage statistical model orders 
can be allowed to grow to relatively large numbers, 
which tends to lower the overall entropy with only a 
small increase in encoding complexity. Moreover, the
merging process improves the robustness of the subband
coder because only global statistics are carried through,
and a strong mismatch between the test sequence and the
coder is less likely.

5. Experimental Results 

Several images of size 512 × 512 (excluding the test
images) were used to design jointly optimized complexity-
and entropy-constrained subband residual scalar
quantization (RSQ) codebooks (i.e., the vector size is
1 × 1). In the proposed framework, the use of larger vector
sizes resulted in a large increase in complexity but no
increase in rate-distortion performance.

For analysis/synthesis, we employ a 3-level bal- 
anced tree-structured IIR allpass polyphase filter bank 
as described in [13], resulting in 64 uniform subbands. 
For this implementation, all stage codebooks in all
subbands contain 3 scalars. The initial maximum allowed
number of conditional probabilities $\mathcal{N}_{\max}$ is
set to 2048. After using the BFOS algorithm, we employ the
PNN algorithm to populate another tree as described before.
The BFOS algorithm is used again to locate the
best numbers of conditioning states for each stage in 
each subband subject to using a maximum number 
of 256 probabilities. The output of each of the stage 
fixed-length RSQ encoders is encoded using an adap- 
tive arithmetic coder driven by the corresponding FSM 
statistical model just generated. 

For each rate-distortion point, the total memory
required to store the subband RSQ codebooks and as- 
sociated mapping tables as well as tables of conditional 
probabilities is approximately 1.5 kilobytes. For en- 
coding (including analysis) using dynamic M-search, 
approximately 8 multiplies and 11 adds per pixel are 
required. Only 3 multiplies and 9 adds are required for 
decoding (including synthesis). In this example, the 
design time is approximately 4 CPU hours on a Sun 
Sparc 10 workstation. Not only are the complexity 
and memory relatively small, but the performance is 
exceptional. For instance, the PSNRs obtained for the 
image LENA at 0.50 and 0.25 bpp are approximately 
37 and 34.1 dB, respectively. Moreover, the subjective 
quality is also very good. Figure 3 shows the coding
result of the image LENA at a bit rate of 0.25 bpp using
the subband coder. This is almost 4 dB better than
the JPEG standard at the same rate.

Figure 3: The LENA image coded using the IIR/RSQ
subband coder. The bit rate is 0.25 bpp and the
PSNR is 34.07 dB.

These experimental results show that the proposed 
subband coder can achieve very good compression re- 
sults while maintaining relatively low complexity and 
memory. Moreover, it also compares favorably with 
standard JPEG in terms of complexity. The proposed 
coder does require more encoding/decoding computa- 
tions, but multiplication-free implementations of it [14] 
can achieve complexity very close to that of JPEG with 
only a small loss (less than 0.50 dB) in quality. 